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Abstract 

In this paper, we apply the Schwarz Waveform Relaxation (SWR) method to 
the one dimensional Schrodinger equation with a general linear or a nonlinear 
potential. We propose a new algorithm for the Schrodinger equation with time 
independent linear potential, which is robust and scalable up to 500 subdo¬ 
mains. It reduces significantly computation time compared with the classical 
algorithms. Concerning the case of time dependent linear potential or the non¬ 
linear potential, we use a preprocessed linear operator for the zero potential case 
as preconditioner which lead to a preconditioned algorithm. This ensures high 
scalability. Besides, some newly constructed absorbing boundary conditions are 
used as the transmission condition and compared numerically. 

Keywords: Schrodinger equation, Schwarz Waveform Relaxation method. 
Absorbing boundary conditions. Parallel algorithms. 


1. Introduction 

Schwarz waveform relaxation method (SWR) is one class of the domain 
decomposition methods for time dependent partial differential equations. The 
time-space domain is decomposed into subdomains. The solution is computed on 
each subdomain for whole time interval and exchange the time-space boundary 
value. Some articles are devoted to this method for linear Schrodinger equation 
BH], advection reaction diffusion equations Emis], wave equations BE] and 
Maxwell’s equation [3. 

This paper deals with the SWR method without overlap for the one di¬ 
mensional Schrodinger equation defined on a bounded spatial domain (00,60)1 
ao,6o G K and t G ( 0 ,T). The Schrodinger equation with homogeneous Neu- 
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mann boundary condition reads 

( ■= {idt + + 'y)u = 0, (t, x) e (0,T) x (oq, &o), 

< u(0,a:) = ■uo(x), a; e (ao,foo), (1) 

[ dnu{t,x) = 0, X = ao,bo, 

where ^ is the Schrodinger operator, i9„ is the normal directive, the initial value 
uq G L^(M) and is a real potential. We consider both linear and nonlinear 
potentials: 

1. r = y(t,x), 

2. 'f = f{u), ex. y = |up. 

In order to perform domain decomposition method, the time-space domain 
(0, T) X (oo, 6 o) is decomposed into N subdomains Qj = (0, T) x flj, Q.j = (o^, bj) 
without overlap as shown in Figure^ for N = 3. 


t 





n2 


n2 

(o,r) X Oi 

(0. T) X O 2 

(O.T) X 03 


Qq = ai bi — 02 &2 — 03 ^2 — bg 


Figure 1: Domain decomposition without overlap, N = S. 


The classical SWR algorithm consists in applying the sequence of iterations 
for j = 2,3,...,iV-l 


= (t,x)G0j, 

u^''‘^(0,x) = Mo(a;), xG^lj, 

X = Oj, 

x = bj. 


( 2 ) 


The two extremal subdomains require special treatment since the Neumann 
boundary condition is imposed in 0 at the points Og and fog. 


' = 0 , (t, x) € 01 , 

Mj''‘^(0,x) = Uo{x),X G fli, 

^niOi’’’^ = 0,X = Oi, 

= Biu^,x = bi, 


' =0,it,x)G&N, 

u^+^(0,x) = Ug(x),x G ilff, 
Bnu’^^ = Bnu%_^,x = Oat, 
= 0 ,x = bN- 


The notation denotes the solution on subdomain 0j = (0,T) x {aj,bj) at 
iteration k = 0,1,2,... of the SWR algorithm. The boundary information is 


2 















transmitted with adjacent subdomains 0j-i and 0j+i through the transmission 
operators Bj. 

The transmission condition is one of the key issues for this method. For 
the linear Schrodinger equation, the SWR method with or without overlap is 
introduced and analyzed by Halpern and Szeftel in [T]. For the decomposition 
without overlap, if is a constant, they use an optimal transmission condi¬ 
tion given by the underlying transparent boundary condition. However, the 
transparent boundary condition is not always available for a variable poten¬ 
tial. Robin transmission condition and quasi-optimal transmission condition 
are therefore used and are named as optimized Schwarz waveform relaxation al¬ 
gorithm and quasi-optimal Schwarz waveform relaxation algorithm respectively. 
In both cases, the transmission operator is written as 

Bj=dn^+S,, (3) 


where the operator Sj is 


Robin : Sj = —ip, p G Quasi-optimal : Sj = —idt — RUj.bjj 

and rij denotes the outwardly unit normal vector at aj or bj. Recently, Antoine, 
Lorin and Bandrauk consider the general Schrodinger equation. On the 
interface between subdomains, they propose to use recent absorbing conditions 
as transmission condition, which is also an idea that we follow in this paper. 

In recent years, some absorbing operators for one dimensional Schrodinger 
equation have been constructed by using some adaptations of pseudo-differential 
techniques uni Enin [13]. We use them here as the transmission operators in 
^ and expect to get good convergence properties. 

We are also interested in this article about the effectivness of the method on 
parallel computers. Another import issue for the method is therefore the seal- 
ability. As we know, without additional considerations, the more subdomains 
are used to decomposed (ao,&o)j the more iterations are required for SWR al¬ 
gorithm to reach convergence. Thus, the total computation time could hardly 
decrease significantly. In this paper, we propose two solutions: a new scalable 
algorithm if the potential is independent of time and a preconditioned algorithm 
for general potentials. 

This paper is organized as follows. In section 2, we present the transmission 
conditions which are used in this paper for the classical SWR algorithm, and 
the discretization that plays an important role for the analyses of the interface 
problem in Section 3. In Section 4 and 5, we present the new algorithm for 
time independent linear potential and the preconditioned algorithm for general 
potentials. Some numerical results are shown in Section 6. Finally, we draw a 
conclusion in the last section. 
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2. SWR algorithm and discretization 

2.1. Transmission conditions 

The transmission conditions on boundary points aj and bj are given thanks 
to the relation 

Bj = d^. + Sj , (4) 

where the operators Sj could take different forms. Besides the Robin transmis¬ 
sion condition, we propose in this paper to use the operators Sj coming from 
the artificial boundary conditions for Q defined in dsmnnanii for a linear or 
nonlinear potential T'(t,x,u). The authors propose three families of conditions 
written as 

dnU + Si^u = 0 , 

on the boundary of considered computation domain, M denotes the order of 
the artificial boundary conditions. We index by I these families of boundary 
conditions: I — 0 for potential strategy, I = 1 for gauge change strategy and 
/ = 2 for Fade approximation strategy. We recall here the definition of operators 
S^^ for the different strategies. 


= 0(113) 



Order 2 : 

o2 

*^0 

= e-*^dy^ 

Order 3 : 

o3 

*^0 

-52_ 

Order 4 : 

o4 

*^0 

_ 53 I. 

— ^0 * 4 -'*) 


-ds, 


1 /2 

where the fractional half-order derivative operator d^. ^ applied to a function h 
is defined by 

al'Mt) = f ^ 

Vtt Jq yjt- 
1 /2 

the half-order integration operator ' and the integration operator are given 
by 

h{s) 


V Jo 


_ -ds, Ithit) = / h{s)ds. 

Vt-s Jo 

Gauge change strategy I = 1 1 |111 IT^ i 

Order 2 : S^ = 

Order 4 : St = Sf - zsgn(dn'r)^^^b^e^^(*’"^Jt( 
where sgn{-) is the sign function and 

V{t,x) = f Y{s,x,u{s,x))ds. 

Jo 
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Fade approximation strategy I = 2 ([nma) 


Order 2 : S'! = idt + , 

Order 4 : S\ = Sl + sgn(5„r) (jd, + r) . ) . 

2.2. Discretization 

The aim of this subsection is to present the discretization of the Schrodinger 
equation with a linear potential Y = V(t,x) or a nonlinear potential Y = f{u). 


2.2.1. Case of linear potential 

First, we describe the discretization of the linear Schrodinger equation. 
We discretize the time interval (0, T) uniformly with intervals and define 
At = T/Nt to be the time step. A semi-discrete approximation adapted to the 
Schrodinger equation on (0,T) x {aj,bj),j = 1,2,...,N is given by the semi¬ 
discrete Crank-Nicolson scheme 


• ^j,n ^j,n — l 


At 


+ dx 


Y Vn + Ki-1 '^j,n + 


= 0, l^nifNT, 


and UjQ= it(0, x) for x G (oj, bj). The unknown function Uj.^{x) is an approxi¬ 
mation of the solution u^{nAt, x) to the Schrodinger equation at time = nAt 
on subdomain D,j and at iteration k. We define the approximation of the po¬ 
tential Vn{x) = V{tn,x). 

For implementation issue, it is useful to introduce new variables = 
+ ^j,n-i)/2 with Vj Q = u’- Q. The scheme could be written as 


2i^ +d 

+ Ox 


+ = 2i 


At ' 


(5) 


with Wn = (Vn + Vn-i)f2. The spatial approximation is realized thanks to a 
classical Pi finite element method. The use of transmission condition gives the 
following boundary conditions for each subdomain 


dn,V^,n + Sv^,n 


X — Uj, 
+ X = bj, 


( 6 ) 


with special treatments for the two extreme subdomains 


= 0, a: = oi, = 0, a; = b^, 


where S' is a semi-discretization of S. For each strategy, S is given by 
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Potential strategy I = 0 


Order 

2 : 

- 

SqV. 

Order 

3 : 

7)1 

0 CO 

Order 

4 : 

SqV. 


At 


s=0 


^2 


^3 


2 2 


s=0 


dn.WnAt 
^ 4 2 


y~! In-sVj^s 


s=0 


where 


(oq j oi, 02,03, 04, 05 j-.-) 4 6^'^^ ^ ^ ^ 

(70,71,72,73, •••) = ( 1 , 2 , 2 , 2 ,...). 

Gauge change strategy I = 1 

Order 2: J—^ 

^ s^O 

Order 4: 

2 




s=0 


where Wn = 


V„+V„_i 


and V„(a;) = /p" P(s,x)(is. 


2 c-xiVA - jp 

Fade approximation strategy I = 2 


^kn=-*(E 


aTHr. + ^ 




s=0 


s=l 


* 1 + dr 


2 z 

_At_ s 

s=l At ^ ^ “fc 




-^4 ^ -^2 ^ 1 


4 ^ J_ W ■!’” 

At ^ 


+ S5^('9n,W"„)-^- 2» , w ^1’"-^’ 

^ At + 

where 4>j^n, s = 1, 2,..., m are introduced as auxiliary functions 


^ + 2 . , w , 5 = 1 , 2,...,771 


^0^-^-h ^+W^+d^'^j,n^ 2^+n/„+d™V^i,n-l 


= 0 , 
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and 






At 


1 ; 

Zit • - 

V'i.n “ ‘^'4’j,n—^ ~ V'j.n—1) 

tjjjfi = 0 . 

We also recall here the Robin transmission condition and its approximation 


S = Sp = -ip, = Spv^^ = -ip ■ r;" 


p e 


We propose below to rewrite ([^ by using fluxes, which are deflned at inter¬ 
faces by 


7^ = d 


(a,) + Svl^{a,), rt = j = 1, 2,..., N, 


with the exception for li n — n ~ obvious that, on each subdomain, 

the boundary conditions are 


9n,v'^.n + = I 




J.J , 


+ Svl„ = rl^, x = bj. 


(7) 


For the transmission condition S'q, 5'g, Si and S'! which do not contain the 
normal derivative of potential Wn , using ^ , we have 


T'l.n = + -S’Uj,„(&l) = „ (os) -f (os) 

- (5nW2,;'(«2) +^<;'(a2)) + 2Sv^,-\a2) = + 2Sv^,-\a2). 

The transmission conditions could therefore be rewritten as 

f lln = 0, + 2Sv^zl„ib,.,), J = 2,..., IV, 


In = 0, = -l-+tn + ^Sv^+U^j+i), J = 1, 2,..., N-1. 


( 8 ) 


Dealing with the transmission conditions 5 'q, Sf and S 2 , we could also obtain 
similar formulas to We can therefore replace the boundary conditions (§ 
for the N local problems by (0 and fluxes definition 

Let us denote by (resp. the nodal Pi interpolation vector of 

(resp. vfj n) with Nj nodes, Mj the mass matrix, Sj the stiffness matrix and 
the generalized mass matrix with respect to Wnv4>dx, j = 1, 2, N. 
Thus, the matrix formulation of the N local problems is given by 


2i 


(A,,„ - - Qj iC r'l.y , 


(9) 


where Aj^n = — Sj -I- Mj^rv^ and is the standard notation of the 

transpose of a matrix or a vector. The restriction matrix Qj is deflned by 


Qj — 


1 0 0 
0 0 0 


0 0 
0 1 


G C 


2xNi 
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S (resp. ) represent the boundary matrix (resp. vector) 

associated with the boundary condition at time step n, which depends on the 
transmission condition. The discrete form of the transmission condition ([^ is 
given by 


= -'^3-1,n + j 

= -lj + l,n + 25(Qj + l,iV^^+i „), j 


2, 3, ...,7V. 


( 10 ) 


where Qjj = (1,0, • • • , 0,0) € , Qj^r = (0,0, • • • , 0,1) S . S is the fully 

discrete version of S. For example the transmission condition Sq leads to 


S^0iQj,rv';^n) 



s=0 



Pn-s{Qj,r''fj^s)- 

s =0 


2.2.2. Case of nonlinear potential 

If the potential is nonlinear Y = f{u), we propose to use the usual scheme 
developed by Duran- Sanz Serna m 




At 


+dx 


u 


'j.n-l 






= 0, 1 ^ ^ TVt’, 


By using the notations defined in the previous subsection, this schema reads as 




( 11 ) 


As in the previous subsection, we use a Pi finite element method to deal with the 
space variable approximation. Since the problem is nonlinear, the computation 
of Vj.^ is made by a fixed point procedure. At a given time t = tn, we take 
Cj = and compute the solution as the limit of the iterative scheme 

with respect to s: 


I^M,- - S, - ] 
At ^ ^ 


2i . 


= —I 
At 




( 12 ) 


where ^jj{v) is the vector associated with f(v)v(j)dx. The matrix and 
the vector depend on the transmission operator. The discrete form of the 
transmission conditions is similar to (101 obtained for linear potential. 


3. Interface problem 

The TV problems ([^ and (121 on each subdomain could be written globally. 
Let us define the global interface vector at iteration k by 

A ... P 7 ^ ... P P 

y \ ' 1,If ' 1,2^’•’■>'1 ,Nt ^ f ^j,If •’••> ^j,NT'^ j,j,NT'> f If 2f’"f ^N,Nt ) ' 

s-,,-^ N^ ^ 

7=1 7 


j=N 














Considering the transmission conditions with flux (101, it is not hard to see that 
there exist an operator TZ such that 

5'=+l=7^5^ (13) 

The interface operator TZ is linear or nonlinear depending on the linearity of Y. 
We focus below on the interface problem for the linear potential 'f' = V{t,x), 
especially for 'f' = V{x). 

For the transmission conditions presented in Section we are going to show 
that if y = V{t,x), then 

= TZg^ = + d, 


(14) 


where £ is a block matrix 

/ 


£ = 


X 


1,4 


X2.1 

^2,2 



\ 



X3.1 

X3.2 


^2.3 

^2,4 






X3.3 

X3.4 



V 


j^N-1,1 j^N-1,2 


j^N-1,3 j^N-1,4 


X 


N,1 




with G j = 1, 2, N, p = 1, 2,3,4 and d is a vector 

d= e dpi, dy^ S 

It is easy to see that the formula is equivalent to 
1 . for j = 1, 


(15) 

(16) 


ik+l 

Lr) ry 


(id \ 


= X^d 


^ 1,2 




2 . for j = 2,. 




Vi,Nt/ 



f ^ 

= X^d 

‘k ^ 

S'.2 

+ X^d 

I- ) 


,X+i i 

/ \ 

[ 1 

S + 1,2 

= X^d 

f \ 

S.2 

+ X^d 

^3,2 






V^NtJ 


+ dj — 




( 17 ) 


^ 1 + 1 , 


9 





















3. for j = N, 


( «.i \ 

' N-1,2 




^ \ 




/fc 

‘at ,2 


Vn,Nt/ 


dw-i 


Proposition 1. For the transmission condition involving the operator Sq, in 
the case of linear potential 'F = V{t,x), if we assume that the matrices Aj^n — 
n = 1,2, .■.,Nx are not singular, then the N equations could he written 
in the global form of interface problem 


r,k+l 


g-^= Cg^ + d. 
Proof. First, according to , we have 


^-.1 - 1 = 0 + e " Q 


-®..nK„ = 


= -.n-a + e " J -^Qj Y, P 2 -gQj^iq - Qj {l^n, 


At 


2i 


„-i+ e- 


“J 


2 

At 

~2 

At 


n —1 

Qj Y - Qj {lY rU 


9=0 
n —1 


At 

n —1 


At 

4z 


At 


9=0 


= E + e-'/y 


9=1 


+ ((-ir-'^M, +e—/y|^/3„QjQ,)u,-o - Qjil^dnf^ ^ ^ 2 , 


where we recall that 'Vjq = Uj^o- Thus, we could see that 


v|,„ = - (A,-„ - ®,.{lY vis 

n—1 


+ (A,,„ - + e—/y-/3„_,QjQ,.)v^,, 

(18) 


9=1 


+ (Aj_„ - ^((-1)” ^^ldnQjQj)uj,o- 

By induction on n, it is easy to see that is a linear function of Ij^ and 


r|g,s = l,2,...,n. Then considering the formulas ( |10[ ), in order to finish the 
proof, we need only verify that S{QjgVj^) and S{Qj^r'Vjn) ^-re linear functions 
of Vj ,.,3 = 1,2,..., n. 
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Proposition 2. For any transmission condition presented in Section^ assum¬ 
ing that the matrices Aj^n — n = 1,2,Nt are not singular, then the 

interface problem in the case of linear potential 'F = V(t, x) could be written in 
the global form ( I 4 )■ 


Proof. The proof is quite similar than that of the previous proposition. For 
each transmission condition, we only need to recalculate the expression of v^„. 

We now turn to the structure of sub-blocks for F = V(x) and j = 2,3,N— 

1 , 


and = {x^’^Ji^n,s^NT and For 5 time steps, 

this structure is described below 


h 



\ 


X 

'k 




0 

X 

k 


, Nt 

<1 

0 

X 

k 


\o 

<] 

0 

X xy 



thus, each sub-diagonal have an identical element. 


Proposition 3. For the transmission condition involving the operator Sq, if 
F = V(x) and assuming that „ — Bj n = 1, 2,..., Nx are not singular, then 
the matrices X^’'^ X^^, X^'^, X^^^ X^'^, j = 2,3, ...,N — 1 and X^’^ are lower 
triangular matrices and they satisfy 



_ 1,4 

“ ^n—l,s—1’ 



^n,s 

_ j,l 

~ ^n—l,s—1’ 

T,i.2 

‘^n,s 

_ jA 

^n—l,s—1 ’ 

^n,s 

_ j,3 

~ ^n—l,s —1’ 

rrjA 

'^n.s 

j,4 

= <’_!,s-l.J 

^n,s 

_ N,1 

^n—l,s—15 




2,3,...,N-1, 


for 2 ^ s ^ n ^ Nt ■ 


Proof. Without loss of generality, we consider here j = 2,3, ...,N — 1. First, 
we design 

j (Aj^n , q — 71, 

= I (A,.„ - B,-„)-i ,9 = 1,2,..., 77 - 1. 

If the linear potential X = V {x) is independent of time, then it is easy to see 


■^f,l ~ -^1,2 


■ • = Bj_i = Bj_2 
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Thus for 2 ^ s ^ n ^ Nt 


(19) 


Y3 = 

^n,s ^n—l,s—1' 


Then, according to (181, we have 


L = ^inQj {lln. rlS + E ^n,A<l + U..nU,, 0 - ( 20 ) 

9=1 


where U,-„ = (A,-„-B,.„)-i (§(-l)"-iMj■ By induc¬ 
tion, we can obtain an expression of 


A = E AQl (A Af + 

9=1 


( 21 ) 


where h^ g, q = 1,2, ...,n and Uj^n are matrix. For example, We 

are going to show that for 2 ^ s ^ n ^ Nt, 


^h,s — 


( 22 ) 


Replacing in (20) by (21), we have 


n— 1 


A =AnQ] (A Af + E A, (E HAJ (A <p)^ + t^i.9 w.o) + u,.„u 


1,9 W.O ) -I- 


9=1 P=1 


n—1 n—1 n—1 

=AnQj A Af + E ( E AAp) QJ (A Af + ( E A.Uj.g + U,,„) u,. 0 

P=1 9=P 9=1 


Comparing the above formula with (21), we have 


yi 

^ n,n') 

TT _ y "-1 
q^s 


l-'n—l.s—1 — 


^ n—l,n—1 ’ 
n—2 


E Yn- 1 . 9 ^,s-l >2 ^ s < n. 

9=3— 1 

(23) 


By using (191 and by induction on n, we get 

^n,n — ^n,n — ^n—l,n—1 — l-'n—l,n—1? 
n—1 n—1 n—2 

Ln,s = E/ ^".9^9.s “ E/ ^n-l,5-l^g-l,s-l = E/ ^ra-l,9^1s-l “ 2 ^ S < Tl. 

q—s q—s q—s—1 


The formula (22) is thus demonstrated. 
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Then we replace in the first two formulas of (101 by (211. We get 

^ 

/fe+l = — r^ 9p-*^/4. / _ (]^ W _|_ 

j.n ' y A j. / ^ Hn—p / ^ 7 \^j,q'i j,q) ' ^Hj,n 


p=i 5=1 


= -r^- 
3 


fc^ n n 

U + ^ E ( E /3n-p^,,) Qj {li,, rl^f + 


9=1 


(24) 


^fc+i _ 

S-i.™ 


jc) ^ ^ 

i*„ + 26-'-/^ Ai E «>.< (E -s-pi-L) of (iL. 4.,)’' + «b,. 


9=1 P=9 


where we denotes the terms that are independent of ^ and s = 1 , 2 ,..., Nt 
by remainder terms i?; ^ to make the proof more readable. 

Moreover, according to (171, we have 


Nt N'T 

/fc + l _ I 7 k +1 N,llk I 7 

h,n ~ / V l.s ~r ^2,Z,n; ^ AT-l.n ~ / > ^n.s ^AT.s ~r ^iV-l,r,n; 

s^l s^l 

A/'x’ A^t 

fc+i i.i/fc fe ij. 

‘ j-l,n ~ 2-^ ‘^n,s''j,s ^ '^n.s' 3 ,s ^ ^j-1,r.n-, 

s^l s^l 

ATj- A^x 

/fc+l _ ■'W i,3 7fe i'^.y.J.4 fe _|_ J 

s^l s^l 

where dj-i^i^n and dj+i^r,n denote the n-th element of dj-i,/ and respec¬ 

tively. 


Comparing the above formula with (241, we have for 1 ^ n < s ^ Nt, 


— .pj,2 _ „i,3 _ j,4 _ r, 

'^n.s ^n.s ‘^n.s '^n.s 


T 


and for 1 ^ s ^ n ^ Nt^ 

n n 

Xn!s = -1 + ScaQj,; ( E Qlh = 2c2Qj,t ( E /3’^-pH.s) Q 

p—s p—s 

n n 

— 2C2QjT ( E/ '^"■-P^P.s) ^n,s = —1 + 2c2QjT ( E/ '^’^“P^P.s) ^ 1 , 

p—s p—s 

where C2 = eT'"''^\ptt ^se Qj= Qj/j^q + Qj,r'^j,q- 
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Finally, using (|22|, we have for 2 ^ s ^ n ^ Nt, 

~2 

p—s 


= -1 + 2e-"/y — Q,,i ( ^ l3n-pH,s) Qh 

p—s 

riy~ ^ 

p—s 

= _l + 2e—/V—Q,-z( ^ /3„_i_,L^,,)gj; 


T _ 


p=s —1 


In the same way, we can prove that x{^^^ = s_i and 


A^,l 


i 4 7,4 

•^71,8 ~ ‘^ 71 - 1 , 8 - 1 * 

Proposition 4. With any transmission condition presented in Section i 
'f' = V{x) and assuming that Aj^n — n = l,2,...,iV7’ are not singular, 

then the matrices X^’^, , X^"^, X^^, X^^, j = 2, 3, ..., — 1 and X 

lower triangular matrices and they satisfy 

1,4 _ 1,4 

^n,s ^n—l,s—1’ 

j,l _ j,l j,2 _ j,2 

^n,s ^n—l,s—1’ ^n,s ^n—l,s—1’ 


J.3 _ 


^n,s l,s—1’ ‘^n 

Ar,l _ N,l 

^n,s l,s—17 


=x^^ti^s-id = 2,3,...,A^- 1, 


for 2 ^ s ^ n ^ Nt ■ 


Proof. The proof is similar to that of Proposition!^ We only need to recom¬ 
pute and ^ for each transmission condition. 

4. New algorithm for time independent linear potential 

The standard implementation of the SWR method for the time-independent 
equations leads to the following classical algorithm 

Algorithm 1: Classical algorithm 
1 : Initialize the iteration by g^, 

2 : Solve Schrodinger on each subdomain with g^. 

3: Exchange values at interfaces and compute g^~^^. 

4: Do again steps 2 and 3 until error < £, e 1. 


As we can see, the classical algorithm requires to solve K times the Schrodinger 
equation on each subdomain, where K corresponds to the number of iterations 
required to reach convergence. We are going to present a new algorithm for 
y = V{x) which is more efficient. As we will see, it will require to solve the 
Schrodinger equation on each subdomain only four times in total. This new 
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algorithm is equivalent to the classical algorithm, but it reduces significantly 
the calculations. 

Before giving this new algorithm, we could see that the classical algorithm 
is based on (13l: = TZg^, where the operator TZ includes the steps 2 and 


3. We have shown in Proposition that 


9 


fc+i 


= Tlg’^ = Hg^ 


d. 


(25) 


It is easy to see that (25) is nothing but the fix point method to solve the 
equation 

(/ - C)g = d. (26) 


A big advantage to interpret (25) as a fixed point method to solve (p^ is that 
we can use any other iterative methods to solve this linear system. So we can 
use Krylov methods (ex. Gmres, Bicgstab) [TB], which could accelerate the 
convergence prospectively. To use the Krylov methods or fixed point method, 
it is enough to define the application of / — £ to vector g by 


{I-C)g = I-ng + d. 


The classical algorithm could then be rewritten with 
Algorithm 2: Classical algorithm, version 2 


Build d = 72. • 0 in (261 explicitly. 


Define the application of / — £ to vector in (261, 


Solve the linear system (261 by an iterative method (fixed point or 
Krylov). 

Solve the Schrodinger equation on each subdomain for each time step 
using the boundary conditions obtained at step 3. 


If the fixed point method is used in Step 3, we recover the first version of 
the classical algorithm. The second version of the classical algorithm allows the 
use of Krylov methods to accelerate convergence. However, applying (/ — £) to 
vector g is still a very expensive operation. With the help of Propositions]^ and 
1 ^ we propose a new algorithm 


Algorithm 3: New algorithm 


1 : Build £ and d in (26) explicitly, 

2 : Solve (26) by an iterative method, 

3: Solve Schrodinger equation on each subdomain using the boundary 
conditions obtained at step 2. 


We show beloa the construction of the matrix £ and the vector d. As it 
will be seenn, their computation is not costly. Regarding the implementation, 
we then show how £ and d are stored for use of parallelism. Here, we use the 
PETSc library |17| . Using the matrix form in PETSc, the memory required for 
each MPI process |18| is independent of the number of subdomains. 
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4-.1. Construction of the matrix C and the vector d 

We use the formulas and ( [To| for the constructions. Numerically, we 
consider and r^^^ as inputs, and^^j^„ and „ as outputs: 


inputs: 


(lOl 


outputs: 


It is easy to see that 

d= {dl^^,dli,dl^^,--- ,d%i)'^ = 7^•0, 
where 0 is the zero vector. The elements of d are obtained by 

fc +1 \ / jk-\-l 


dj—l^r — 


\„fc+i / 

fc + l /fc + 1 




S + 1,2 


^ J +1, iV'T ' 


where the scalars r^_^ ^^+1 g, s = 1 , 2 , ...,A^ 7 ’ are given by the formula (101 
with 

lis = ds = 0,s = l,2,...,NT. 

The equation is solved numerically on each subdomain only one time. Note that 
this construction works for y = V{t,x). 

According to Propositions]^ andif iT = V(x), in order to build the matrix 
£, it is enough to compute the first columns of blocks X^’^, 

Xh'^, j = 2,3,...,X- 1 and X^’b 
The first column of X^^ is 

„fc+i 








Xdi 

0 

= (x^’i 

0 

+ X^’2 

0 


W 


Voj 


w 


,r) 


T dj—r 7 . ) dj — \ — 


y-i,r 


CjaA 

A-1,2 

\x+i / 


dj —l,r- 


The first column of X^’^ is 








Xd3 

0 

= (x^’3 

0 

+ X^'d 

0 


[ 0 ) 


loj 


w 


b+i.i 


/ \ 

[ W 1 

b+ 1,2 


~ dj+i j — 




-d 


■j+i.i- 




The scalars s = 1, 2,..., Xy are computed by the formula (^lOl with 

g = Vj g = 0 , s = 1 , 2 ,..., Nt except for Ij ^ = 1 . 

The equation is solved numerically only one time on the subdomain (oj, bj). 
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In the same way, the first columns of X^’’^ and are 



0 

= 

0 


0 


—l,r — 

M-1.2 

1 

1 


w 


[o) 


w 



j-l,NT' 



and 








XoA 

0 

= 

0 

+ x^'^ 

0 


loj 


loj 


w 


/ \ 

( \ 

S + 1.2 


V 


-d 




where the scalars s 

but with 


1 , 2 ,..., Nt are obtained by the formula (10 1 , 


Ij g = Vj g = 0, s = 1, 2,..., Nt except for r^ i — 1. 


The equation is solved numerically on each subdomain {aj, bj) only one time. 

In conclusion, it is sufficient to solve the equation © on each subdomain 
three times to construct explicitly the interface problem. The construction is 
inexpensive. In total, the equation Q is solved on each subdomain four times 
in the new algorithm. Numerically, we will compare the classical and the new 
algorithms in Section |0] 


4-2. Storage of the matrix C and the veetor d for massive parallel eomputing 

Thanks to the peculiar form of the matrix C, we can build it on parallel 
computers through an MPI implementation. The transpose of C is stored in 
a distributed manner using the library PETSc. As we can see below, the first 
block column of C is in MPI process 0. The second and third blocks columns are 
in MPI process 1, and so on for other processes. The consumed memory for each 
process is at most the sum of 4 blocks. The size of each block is Nt x Nt- Each 
block contain {Nt + 1) x Nt/^ non zero elements according to Propositions]^ 
and[T] 
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£ = 


/MPI 0 

MPI 

1 

MPI 

2 

MPI 

N-2 


^2.1 

X2.2 





Xl.4 



^3,1 

^3,2 




^2,3 

^2.4 

X3.3 

X3d 


XN-1,2 

V 





X^-1.3 



MPI Af-1\ 


/ 

(27) 

The vector d can also be stored in PETSc form. The first block is in MPI 
process 0, the second and the third are in MPI process 1, and so on. The last 
block is in MPI process iV — 1. Each MPI process contain at most 2 x Nt 
elements. 

jT \T 
^N,l J ■ 


^ ( ^l,r ) ^2,h ^2,n ‘ ‘ ‘ > ^j,r ) ’ ’ ’ ’ 


MPI 0 MPI 1 


MPI j - 1 


MPI N -1 


5. Preconditioned algorithm for general potentials 


In Section]^ we have established the interface problem (13) for Schrodinger 
equation with time dependent or nonlinear potential. However, it is not pos¬ 
sible to construct the interface matrix C without much computation since the 
Propositions and only hold for time independent linear potential. Thus, the 
new algorithm is not suitable here. Instead, to reduce the number of iterations 
required for convergence, we propose to add a preconditioner P~^ {P is a non 


singular matrix) in (131 which leads to the preconditioned algorithm: 


1. for y = V{t,x), 


=I-P-\l-n)g\ (28) 

P-\I - C) = p-^d, (29) 

2 . for y = f{u), 

(30) 

We now turn to explain which preconditioner is used. The interface problem 
for the free Schrodinger equation (without potential) is 


9 


k+l 


= /:o/ 


where the symbol Cq is used to highlight here the potential is zero. The trans¬ 
mission condition is the same as that for (Q. We propose for time dependent 
or nonlinear potential the preconditioner as 


P = I-Co. 
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We have two reasons to believe that this is a good choice. 

1. The matrix Cq can be constructed easily since a zero potential is indepen¬ 
dent of time. Therefore, the construction of £o only needs to solve the free 
Schrodinger equation two times on each subdomains. This construction is 
therefore scalable. 

2. Intuitively, the Schrodinger operator without potential is a roughly ap¬ 
proximating of the Schrodinger operator with potential: 

idt + dxx ~ idt -f f, 


thus 


P— I — Cq~I — P — I — Cq~I — {TZnl ~ Pnl ■ 0 )- 


Next, we present the application of preconditioner. The transpose of P is 
stored in PETSc form. For any vector y, the vector x := P~^y is computed by 
solving the linear system 

Px = {I — Co)x = y (31) 


We do not explicitly construct the matrix P~^ as the inverse of a distributed 
matrix numerically is too expensive. The linear system (31) is solved by the 
Krylov methods (Gmres or Bicgstab) initialized by zero vector using the library 
PETSc. We will see in Section [63] that the computation time for applying this 
preconditioner is quite small compared with the computation time for solving 
the Schrodinger equation on subdomains. 


6. Numerical results 

The physical domain (oq, 6 o ) = (~21, 21) is decomposed into N equal subdo¬ 
mains without overlap. We fix in this section the final time to T = 0.5, the time 
step to At = 0.001 and the mesh size to Ax = 10“® without special statement. 
The potentials that we consider in this part and the corresponding initial data 
are 

1 . time independent linear potential: Y = —x^, uo(x) = + 20 i{x+io)^ 

2 . time dependent linear potential: Y = btx, uo{x) = -1-202(2:-i-io)^ 

3. nonlinear potential: Y = |up, uo(x) = 2sech(-\/^(x -|- 

which give rise to solutions that propagates to the right side and undergoes 
dispersion. Since the matrices Mj, Sj and are both tri-diagonal sym¬ 

metric in one dimension, the consumed memory is low. It is thus possible to 
solve numerically the Schrodinger equation on the entire domain (0, T) x (oq, bo) 
with a standard machine. The modulus of solutions at the final time t = T are 
presented in Figure|^for Y = —x^ and Y = |up. 

We use a cluster consisting of 92 nodes (16 cores/node, Intel Sandy Bridge 
E5-2670, 32GB/node) to implement the SWR algorithms. We fix one MPI 
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Figure 2: |no,o| et |rio,jVi,| on (ao,bo), ^ = —x^ (left) and i' = |rip (right), At = 0.001, 
Ax = 10-5. 


process per subdomain and 16 MPI processes per node. The communications are 
handled by PETSc and Intel MPI. The linear systems ^ and (121 related to the 
Schrodinger equation are solved by the LU direct method using the MKL Pardiso 
library. The convergence condition for our SWR algorithm is || — 9^ ||< 

IQ- 10 ^ Two types of initial vectors are considered in this article. One is 
the zero vector, another is the random vector. According to our tests, the zero 
initial vector makes the algorithms to converge faster, but obviously it could not 
include all the frequencies. As mentioned in m, using the zero initial vector 
could give wrong conclusions associated with the convergence. Thus, the zero 
vector is used when one wants to evaluate the computation time, while the 
random vector is used when comparing the transmission conditions. 


6.1. Comparison of classical and new algorithms 

We are interested in this part to observe the robustness of the algorithms, to 
know whether they converge or not for the time independent potential = —x^. 
Similarly, we will observe the computation time and the high scalability of the 
algorithms. We denote by the computation time required to solve numer¬ 
ically on a single processor the Schrodinger equation on the entire domain and 
T®'® (resp. T"®'^) the computation time of the classical (resp. new) algorithm 
for N subdomains. We test the algorithms for N = 2, 10,100, 500,1000 subdo¬ 
mains with the transmission condition Sq. The reason for using S'g for these 
tests will be explained in Remark The initial vector here is the zero vector. 

First, the convergence history and the computation time for the algorithms 
are shown in Figure]^ and Table [T] where the fixed point method is used on the 
interface problem. The algorithms converge for 500 sub domains, but not for 
1000 sub domains. 

Next, we use the Krylov methods (Gmres or Bicgstab) on the interface prob¬ 
lem instead of the fixed point method. Table present the computation time. 
As we can see, the use of Krylov methods allows to obtain robust scalable SWR 
algorithms. The algorithms converge for 1000 subdomains and are scalable up 
to 500 subdomains. Besides their computation times are lower than the ones of 
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Figure 3: Convergence history, N = 2, 100, = —3^^, At = 0.001, Ax = 10 Fixed point. 


Table 1: Computation time in seconds, Y = —At = 0.001, Ax = 10 Fixed point. 


N 

2 

10 

100 

500 

^ref 

403.56 

2^cls 

773.07 

2937.77 

359.30 

284.78 

2^new 

773.72 

178.30 

18.19 

4.76 


Table 2: Computation time in seconds, Y = —x^, At = 0.001, Aa: = 10 Gmres and 
Bicgstab. 



N 

2 

10 

100 

500 

1000 



403.56 

Gmres 

j^cls 

771.82 

2577.51 

2249.54 

907.06 

739.65 

'j-’new 

777.42 

177.20 

18.95 

6.86 

8.17 

Bicgstab 

'J'cls 

rpnew 

774.19 

774.44 

2760.11 

177.02 

679.72 

18.18 

799.09 

6.83 

845.65 

7.12 
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the classical algorithm. Roughly speaking, in Table and Table we have 

X iViter + ..., 

T“'" = r,,bx4 + Tid + ..., 

where T^uh is the computation time for solving the equation on one subdomain, 
Tj^d is the computation time for solving the interface problem, represent 
the negligible part of computation time such as the construction of matrices for 
the finite element method. If the number of subdomains N is not so large, then 
Tsuh ^ TlcI and the minimum of iVuer is 3 in all our tests. If the number of 
subdomains N is large, then T^d ~ Tsub and IViter 4. It is for this reason 
that the new algorithm takes less computation time. However, as the number 
of subdomains increase, T^d becomes larger. Thus, the new algorithm loses 
scalability if the number of subdomains is large. 

In conclusion, the new algorithm with Krylov methods is robust and it takes 
much less computation time than the classical algorithm. 

6.2. Comparison of classical and preconditioned algorithms 

In this part, we are interested in observing the robustness, the computation 
time and the scalability of the preconditioned and non-preconditioned (classi¬ 
cal) algorithms for time dependent potential ^ = 5tx and nonlinear potential 
'C = jup. We denote by Npc the number of iterations required to obtain con¬ 
vergence with the preconditioned algorithm and Tpc the computation time of 
the preconditioned algorithm. The transmission condition used in this section 
is S'q. We use the zero vector as the initial vector qq. 

First, we present in Figure]^ the convergence history for Y = 5tx. If N is 
not large, then there is no big difference between the classical algorithm and the 
preconditioned algorithm. However, if N is large, then as at each iteration, one 
subdomain communicate only with two adjacent subdomains, we can see that 
the non-preconditioned algorithm converges very slowly in the first interations. 
The convergence of the preconditioned algorithm improves greatly since the 
preconditioner allows communication with remote subdomains. The number of 




Figure 4: Convergence history, N = 10,1000, V" = 5tx, At = 0.001, Ax = 10 ®. 
iterations required for convergence and the computation time are presented in 
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Table for = 10, iV = 100, N = 500 and N = 1000. We can see that 
the preconditioner allows to decrease significantly both the number of iterations 
and the computation time. The strong scalability of the classical algorithm is 
very low. Indeed, the number of iterations required increases with the number 
of subdomains. The preconditioned algorithm is much more scalable (up to 
500 subdomains). However, it loses scalability from N = 500 to N = 1000. 
There are two reasons. One is that the number of iterations required for N = 
1000 is a little bit more than that for N = 500. The other one is linked to 
the implementation of the preconditioner. Indeed, the time Tpc consists of 
three major parts: the application of TZ to vectors (step 1, denoted by Ti), 
the construction of the preconditioner (denoted by T^c) and the application of 
preconditioner (step 3, denoted by T 3 ). We have thereby 

Tpc « Ti + T^c + T 3 . (32) 


If N is not very large, Ti ^ T^c ^ Tb. By increasing the number of subdomains, 
Ti and T^c decreases and T 3 increases. Thus, if N is large, is not negligible 
compared to Ti and Tsc- However, it is not very convenient to estimate Ti 
and T 3 in our codes because we use the "free-matrix" solvers in the PETSc 
library. To confirm our explanation, we make tests using a coarser mesh in 
space (At = 0.001, Aa: = lO”'^). The size of the interface problem (131 is 
the same, thus T 3 should be similar to that of the previous tests (At = 0.001, 
Ax = 10“®). But the size of the problem on a subdomain is ten times smaller. 
Thus, Ti and T^c are both smaller. The preconditioned algorithm should be less 
scalable. The results are shown in Table It can be seen that the computation 
time Tpc for N = 1000 is larger than for N = 500 and the preconditioned 
algorithm is not very scalable from N = 100 to N = 500. Despite this remark, 
we could conclude from our tests that the preconditioned algorithm reduces a 
lot the number of iterations and the computing time compared to the classical 
algorithm. 


Table 3: Number of iterations required and computation time of the classical algorithm and 
the preconditioned algorithm, y = 6tx, At = 0.001, Ax = 10~^. 


N 

10 

100 

500 

1000 

-Nnopc 

17 

71 

349 

695 

Npc 

17 

32 

31 

35 

rpre? 

6496.3 

Tnopc 

10123.1 

3217.0 

2466.5 

2238.0 

Tpc 

10128.9 

1432.7 

250.0 

170.7 


Next, we reproduce the same tests for the nonlinear potential T = \u\‘^. The 
convergence history is presented in Figure We show the number of iterations 
and the computation time in Table The conclusions are quite similar. 

6 .3. Comparison of the transmission conditions 

In this part, we compare the transmission conditions which are presented 
in Section in the framework of the new algorithm for T = —x^ and the 
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Table 4: Number of iterations required and computation time of the classical algorithm and 
the preconditioned algorithm, 'f = 5tx, At = 0.001, Ax = 10“^. 


N 

10 

100 

500 

1000 

Nnopc 

17 

71 

349 

695 

Npc 

17 

32 

26 

25 

rpref 

507.5 

Tnopc 

681.9 

223.8 

210.2 

191.2 

Tpc 

694.3 

107.6 

38.4 

54.5 




Number of Iteration; 


Figure 5: Convergence history, N = 10,1000, "V = |iip, At = 0.001, Ax = 10 ®. 


Table 5: Number of iterations required and computation time of the classical algorithm and 
the preconditioned algorithm, 'f = |up. At = 0.001, Ax = 10“®. 


N 

10 

100 

500 

1000 

-Nnopc 

12 

71 

349 

694 

Afpc 

11 

22 

25 

26 

^ref 

3200.8 

Tnopc 

2582.3 

1332.2 

1248.0 

1129.7 

Tpc 

2446.7 

408.2 

117.6 

83.8 
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preconditioned algorithm for Y = |up. The theoretical optimal parameter p in 
the transmission condition Robin being not at hand, we seek the best parameter 
numerically. We use in the subsection the random vector as the initial vector 
Po to make sure that all frequencies are present. 

6.3.1. Case of linear potential 

We first consider the linear potential Y = —x'^. We compare the number 
of iterations, the total computation time to perform a complete simulation and 
the computation time required (Trd) to solve the interface problem in Table 
for N = 2 using the fixed point method, Gmres and Bicgstab methods on 
the interface problem. As can be seen, the total computation times are almost 
identical. The required computation time for solving the interface problem is 
relatively close to zero compared with the total computation time. Therefore, 
we are interested rather in the number of iterations. We can make the following 
observations 

1. the number of iterations required for the Robin transmission condition is 
greater compared to the other three strategies, 

2 . in each strategy, the number of iterations is not sensitive to order, 

3. for the Fade approximation strategy, the number of iterations decrease as 
the parameter of Fade (m) increase. 


Table 6: Comparison of transmission conditions for N = 2, V = —x^, At = 10 Ax = 10 


Fixed point 

Gmres 

Bicgstab 

Strategy 

Mter 

Txd 

^total 

Mter 

Txd 

^total 

Mter 

Txd 

^total 



6 

0.005 

775.7 

5 

0.002 

774.2 

3 

0.002 

773.8 



6 

0.002 

774.2 

5 

0.002 

779.6 

3 

0.002 

773.3 



6 

0.002 

769.0 

5 

0.002 

774.2 

3 

0.002 

773.6 


s? 

6 

0.002 

773.4 

5 

0.002 

773.2 

3 

0.002 

773.8 


Sf 

6 

0.002 

773.9 

5 

0.002 

773.6 

3 

0.002 

774.5 


q2,20 

^2 

191 

0.062 

773.3 

28 

0.010 

774.5 

16 

0.011 

773.1 


o 2,50 

*^2 

76 

0.025 

773.6 

27 

0.010 

773.3 

15 

0.010 

773.6 

sY 

c . 2,100 

*^2 

39 

0.013 

776.3 

23 

0.008 

775.2 

13 

0.009 

773.6 


e 4,20 

*^2 

181 

0.059 

769.9 

28 

0.010 

774.6 

15 

0.010 

773.6 


c - 4,50 

*^2 

77 

0.025 

776.0 

27 

0.010 

773.5 

15 

0.010 

773.3 


o 

o 

39 

0.013 

775.4 

23 

0.008 

773.8 

13 

0.009 

774.8 

Robin* 

1112 

0.360 

774.7 

47 

0.017 

776.4 

27 

0.018 

777.4 


* the parameters for the transmission condition Robin are p = 44 (fixed point), p = 5 
(Gmres) and p = 5 (Bicgstab). 


We make the same tests for N = 500, the results are shown in Table We 
could see that 
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1. in each strategy, the number of iterations is not sensitive to order, 

2. for the Fade approximation strategy, if the parameter m is small, then the 
algorithm is not robust, 

3. the Krylov methods (Gmres and Bicgstab) could not always reduce the 
number of iterations. 


Table 7: Comparison of transmission conditions for N = 500, V = At = 10 Ax = 

10-5. _ 



Fixed point 

Gmres 

Bicgstab 

Strategy 

Mter 

Thd 

^total 

-^iter 

Trd 

^total 

Mter 

Trd 

^total 



357 

0.775 

4.68 

1023 

2.883 

6.91 

368 

1.646 

5.51 



337 

0.734 

4.62 

977 

2.620 

6.55 

345 

1.831 

5.77 



337 

0.733 

4.65 

978 

2.681 

6.54 

350 

1.739 

5.73 

Sf 

s? 

341 

0.745 

4.62 

1010 

2.364 

6.20 

353 

2.102 

6.00 












Sf 

340 

0.743 

4.63 

1023 

3.454 

7.19 

351 

2.225 

6.06 


q 2,20 

^2 

- 



1240 

3.368 

7.34 

440 

2.626 

6.64 


c , 2,50 

*^2 

- 



997 

2.320 

6.30 

352 

2.240 

6.16 

sf 

^,100 

336 

0.735 

4.62 

998 

3.055 

7.03 

333 

1.603 

5.62 

e 4,20 

*^2 




1216 

3.349 

7.31 

464 

2.044 

6.05 


c - 4,50 

*^2 

- 



1043 

3.907 

7.85 

336 

1.756 

5.63 


o 

o 

336 

0.733 

4.60 

1024 

2.424 

6.35 

334 

1.989 

5.95 

Robin* 

1690 

3.628 

7.52 

1060 

3.000 

6.80 

318 

1.41 

5.32 


*: the parameters for the transmission condition Robin are p = 45 (fixed point), p = 19 
(Gmres) and p = 6 (Bicgstab). 

the algorithm does not converge before 2000 iterations. 


We could conclude that if the number of subdomains N is not very large, the 
potential strategy in order 2 with Bicgstab method on the interface problem is 
a good choice. If N is large, the Bicgstab method also allows most of the 
algorithms to converge, but it is difficult to have a general conclusion for the 
transmission conditions in the framework of new algorithm. 

6.3.2. Case of nonlinear potential 

Now we turn to compare the transmission conditions for the nonlinear poten¬ 
tial "C = |up in the framework of the preconditioned algorithm. First, we study 
the influence of the parameter p in the Robin transmission condition. The num¬ 
ber of iterations and the computation time are shown in Table It is clear that 
the convergence is not sensitive to this parameter. Next we compare the three 
strategies. The numericals results are presented in Table The transmission 
conditions Sq, Sf and S'! include the evaluation of f{u). We don’t And a suit¬ 
able discretization of this term such that the continuity of vj at the interfaces 
ensure the continuity of dn f(u). Thus we could not obtain the solution 
that satisfy Uj^n = RjUo,n- We could see that the number of iterations is not 
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Table 8: Influence of parameter p in the transmission conditions Robin, N = 2, 10,100, 
r = |u|2, At = 0.001, Ax = 10-"^. 




N 

= 2 

N 

= 10 

N = 

100 


p 

^iter 

^total 

NiteT 

^total 

Mter 

^total 


5 

9 

1042.6 

12 

257.9 

21 

55.7 


10 

8 

920.3 

11 

230.9 

22 

50.6 

Robin 

15 

8 

920.3 

11 

228.7 

22 

46.7 

20 

8 

914.5 

11 

226.1 

22 

43.7 


25 

8 

913.0 

11 

226.4 

22 

43.6 


30 

8 

919.2 

11 

227.6 

22 

43.9 


35 

8 

922.1 

11 

231.8 

22 

44.4 


40 

8 

922.8 

12 

250.2 

22 

45.0 


45 

8 

921.7 

12 

252.5 

22 

46.0 


50 

8 

928.3 

12 

253.3 

22 

46.7 


sensitive to the transmission condition and its order. However the computation 
time for the Fade strategy is greater than other strategies. On each subdomain, 
the non linearity is approximated by a fixed point procedure (see formula @). 
This fixed point procedure converges more slowly using the Fade strategy than 
the other strategies. This observation is also found in m- In conclusion, in 
the nonlinear case, we also think that the potential strategy of order 2 (S'q) is 
a good choice. 

Table 9: Comparison of transmission conditions for N = 2,10,100, iC = |u|2. At = 0.001, 
Ax = lOyb_ 




N 

= 2 

N 

= 10 

N = 

100 



Mter 

^total 

^iter 

^total 

Mter 

^total 

5*^ 

sg 

8 

909.5 

11 

229.1 

22 

40.6 

sg 

7 

802.1 

10 

205.8 

22 

41.6 


SI 

7 

802.3 

10 

205.6 

22 

41.4 


q2,20 

^2 

7 

1732.5 

10 

572.0 

22 

128.6 

qM 

*^2 

q2,50 

^2 

7 

4042.9 

10 

1342.3 

23 

310.3 


o 

o 

7 

7900.5 

10 

2640.0 

22 

576.0 


Remark 1. As we indicated previously, we explain here our choice of trans¬ 
mission condition: the potential strategy of order 2 (iSg). Indeed, it seems 
reasonable to consider it since 

1. the algorithm is robust and the computation time for Sq is similar to 
others transmission conditions. 
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2. if is not so large, it is one of the best choice, 

3. the implementation of Sq is much easier than other transmission condi¬ 
tions. 

ff.4- Gpu acceleration 

If the number of subdomain N is not so large, then solving the Schrodinger 
equation on subdomains takes most of the computation time. We move these 
computations from Cpu to Gpu. In this subsection, we present the numeri¬ 
cal experiments of Gpu acceleration. Two Gpu libraries of NVIDIA are used: 
GUSPARSE (tri-diagonal solver) and GUBLAS (BLAS operations). We use 8 
Gpu Kepler K20, and compare the Gpu and Gpu results for = 2,4,8. We 
use always 1 Gpu/MPI process. Gpu could accelerate a lot the computation as 


Table 10: Cpu and Gpu computation time, Bicgstab, 5 q, At = 0.001, Aa; = 10 ®, C = —x^. 


N 

2 

4 

8 

rpCpU 

774.4 

393.0 

203.2 

'J'GpU 

27.90 

16.13 

12.54 

'J'CpU ^rpQpu 

18 

24 

16 


shown in Table However the algorithm on Gpu is not scalable. The reason 
is that the size of problem is not large enough for Gpu. Gpu waste some of its 
ability. We test a larger case only for Gpu: At = 0.001, Aa; = 10“®. The results 
are shown in Table ni 


Table 11: Gpu computation time, Bicgstab, Sq, A = 0.001, Ax = 5 X 10 ®, V = —x^. 


N 

2 

4 

8 

'j'Gpu 

51.95 

28.21 

16.30 


Finally, we make the same tests for the nonlinear potential in the framework 
of the preconditioned algorithm. The results are presented in Table and 
Table [m The conclusion is similar. 


Table 12: Cpu and Gpu computation time. At = 0.01, Ax = 10 ®, V = \u\^. 


N 

2 

4 

8 

rpGpU 

373.6 

526.7 

316.0 

'j'Gpu 

73.9 

40.1 

34.0 

'J'CpU ^rpQpu 

5 

13 

9 
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Table 13: Gpu computation time, At = 0.01, Ax = 5 X 10 V = \u\^. 


N 

2 

4 

8 

'j'Gpu 

134.3 

73.7 

46.0 


7 . Conclusion 

We proposed in this paper a new algorithm of the SWR method for the one 
dimensional Schrodinger equation with time independent linear potential and a 
preconditioned algorithm for general potentials. The algorithms for both cases 
are scalable and could reduce significantly the computation time. Some newly 
constructed absorbing boundary conditions are used as the transmission condi¬ 
tion and compared numerically in the framework of the algorithms proposed by 
us. We believe that the potential strategy of order 2 is a good choice. Besides, 
we adapted the codes developed on Cpu to Gpu. According to the experiments, 
the computation could be accelerated obviously. 


Acknowledgements 

We acknowledge Pierre Kestener (Maison de la Simulation Saclay France) 
for the discussions about the parallel programming, especially for his help about 
Gpu acceleration. This work was partially supported by the French ANR grant 
ANR-12-MONU-0007-02 BEGASIM (Modeles Numeriques call). 

References 

References 

[1] L. Halpern, J. Szeftel, Optimized and quasi-optimal Schwarz waveform 
relaxation for the one dimensional Schrodinger equation. Math. Model. 
Methods Appl. Sci. 20 (12) (2010) 2167-2199. 

[2] L. Halpern, J. Szeftel, Optimized and quasi-optimal Schwarz waveform 
relaxation for the one-dimensional Schrodinger equation. Tech, rep., GNRS 
(2006). 

[3] F. Gaetano, M. J. Gander, L. Halpern, J. Szeftel, Schwarz waveform re¬ 
laxation algorithms for semilinear reaction-diffusion equations. Networks 
Heterog. Media 5 (3) (2010) 487-505. 

[4] M. J. Gander, L. Halpern, Optimized Schwarz Waveform Relaxation Meth¬ 
ods for Advection Reaction Diffusion Problems, SIAM J. Numer. Anal. 
45 (2) (2007) 666-697. 

[5] T. Hoang, J. Jaffre, G. Japhet, M. Kern, J. Roberts, Space-Time Do¬ 
main Decomposition Methods for Diffusion Problems in Mixed Formula¬ 
tions 51 (6) (2013) 3532-3559. 


29 




[6] M. J. Gander, L. Halpern, F. Nataf, Optimal Schwarz waveform relaxation 
for the one dimensional wave equation, SIAM J. Numer. Anal. 41 (5) (2003) 
1643-1681. 

[7] L. Halpern, J. Szeftel, Nonlinear nonoverlapping Schwarz waveform relax¬ 
ation for semilinear wave propagation, Math. Comput. 78 (266) (2009) 
865-889. 

[8] V. Dolean, M. J. Gander, L. Gerardo-Giorda, Optimized Schwarz Methods 
for Maxwell’s Equations, SIAM J. Sci. Gomput. 31 (3) (2009) 2193-2213. 

[9] X. Antoine, E. Lorin, A. D. Bandrauk, Domain Decomposition Methods 
and High-Order Absorbing Boundary Gonditions for the Numerical Simu¬ 
lation of the Time Dependent Schrodinger Equation with Ionization and 
Recombination by Intense Electric Field. 

[10] X. Antoine, G. Besse, S. Descombes, Artificial boundary conditions for one¬ 
dimensional cubic nonlinear Schrodinger equations, SIAM J. Numer. Anal. 
43 (6) (2006) 2272-2293. 

[11] X. Antoine, G. Besse, P. Klein, Absorbing boundary conditions for the one¬ 
dimensional Schrodinger equation with an exterior repulsive potential, J. 
Gomput. Phys. 228 (2) (2009) 312-335. 

[12] X. Antoine, G. Besse, P. Klein, Absorbing Boundary Gonditions for Gen¬ 
eral Nonlinear Schrodinger Equations, SIAM J. Sci. Gomput. 33 (2) (2011) 
1008-1033. 

[13] X. Antoine, G. Besse, J. Szeftel, Towards accurate artificial boundary con¬ 
ditions for nonlinear PDEs through examples, Gubo, A Math. J. 11 (4) 
(2009) 29-48. 

[14] P. Klein, Gonstruction et analyse de conditions aux limites artificielles pour 
des equations de Schrodinger avec potentiels et non linearites, Ph.D. thesis, 
Universite Henri Poincare, Nancy 1 (2010). 

[15] A. Duran, J. Sanz-Serna, The numerical integration of relative equilibrium 
solutions. The nonlinear Schrodinger equation, IMA J. Numer. Anal. 20 (2) 
(2000) 235-261. 

[16] Y. Saad, Iterative methods for sparse linear systems, 2nd Edition, Society 
for Industrial and Applied Mathematics, 2003. 

[17] S. Balay, M. F. Adams, J. Brown, P. Brune, K. Buschelman, V. Eijkhout, 
W. D. Gropp, D. Kaushik, M. G. Knepley, L. G. Mclnnes, K. Rupp, B. F. 
Smith, H. Zhang, PETSc Users Manual, Tech. Rep. ANL-95/11 - Revision 
3.4, Argonne National Laboratory (2013). 

[18] Message Passing Interface Forum, MPI: A Message-Passing Interface Stan¬ 
dard Version 3.0, Tech. rep. (2012). 


30 



[19] M. J. Gander, Schwarz methods over the course of time, Electron. Trans. 
Numer. Anal. 31 (2008) 228-255. 


31 



